feat(serve): improve OpenAI API compatibility with usage, finish_reas…#771
Conversation
…on, and system_fingerprint Add missing OpenAI API fields to m serve chat completion responses: - Add `finish_reason` field to Choice model (defaults to "stop") - Add `usage` field with token counts (prompt_tokens, completion_tokens, total_tokens) extracted from ModelOutputThunk.usage when available - Add `system_fingerprint` field populated from model or provider metadata - Fix bug in model_options extraction (use model_dump().items() instead of iterating request) Includes comprehensive test coverage with 13 unit tests verifying all new fields and edge cases (missing data, partial data, fallback behavior). Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
|
The PR description has been updated. Please fill out the template for your PR to be reviewed. |
Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
|
I used Claude to do a PR review, here are a few highlight you may want to look into: Issues
Minor / Nits
|
system_fingerprint should be a backend config hash, not the model name. The model name is already in response.model. Set to None since we don't currently track backend config fingerprints. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
When partial usage data is provided (e.g., only prompt_tokens), total_tokens was incorrectly defaulting to 0. Now it's calculated as prompt_tokens + completion_tokens when not explicitly provided. This prevents silent bad values from going out over the OpenAI-compatible API. - Modified cli/serve/app.py to calculate total_tokens when missing - Updated test_usage_with_partial_data to expect correct behavior Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
Replace hardcoded length assertion with implementation-agnostic check. The test now validates that the ID has the correct prefix and a non-empty suffix, without coupling to specific length or format details. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
I never got that error in precommit. 🤷♂️ Updated with fixes for the comments |
ajbozarth
left a comment
There was a problem hiding this comment.
Claude and I agree this LGTM, might want one more approval before merge for sanity
Misc PR
Type of PR
Description
feat(serve): improve OpenAI API compatibility with usage, finish_reason, and system_fingerprint
Add missing OpenAI API fields to m serve chat completion responses:
finish_reasonfield to Choice model (defaults to "stop")usagefield with token counts (prompt_tokens, completion_tokens, total_tokens)extracted from ModelOutputThunk.usage when available
system_fingerprintfield populated from model or provider metadataIncludes comprehensive test coverage with 13 unit tests verifying all new fields
and edge cases (missing data, partial data, fallback behavior).
Testing